TEI exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Linguistic documents synchronizing sound and text

Internal identifier: 000035 (France/Analysis); previous: 000034; next: 000036

Authors: Michel Jacobson [France]; Boyd Michailovsky [France]; John B. Lowe [France]

RBID : ISTEX:76F37F4EC8D5D4F4473AAD428436716F4418582A

Abstract

The goal of the Langues et Civilisations à Tradition Orale (LACITO) Linguistic Archive project is to conserve and disseminate recorded and transcribed oral literature and other linguistic materials, mainly in unwritten languages, giving simultaneous access to sound recordings and text annotation. The project uses XML markup for the kinds of annotation traditionally used in field linguistics. Transcriptions are segmented into sentences (roughly) and words. Annotations are associated with different levels: metadata at the text level, free translation at the sentence level, interlinear glosses at the word level, etc. Time-alignment is at the sentence and optionally at the word level. The project makes maximum use of standard, generic software tools. Marked-up data are processed using freely available XML software and displayed using standard browsers. The project has developed (1) an authoring tool, SoundIndex, to facilitate time-alignment, (2) a Java applet, which enables browsers to access time-aligned speech, (3) XSL stylesheets, which specify “views” on the data, and (4) Common Gateway Interface (CGI) scripts, which allow the user to choose documents and views and to enter queries. Current objectives include development of the annotation and software to facilitate linguistic research beyond simple browsing. Over 100 texts in 20 languages have been processed at the time of writing; some of these are available on the Internet for browsing and simple querying.
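
The annotation levels described in the abstract (metadata on the text, free translation on each sentence, glosses on each word, time-alignment on sentences and optionally on words) can be pictured with a small data model. The following is only a minimal sketch under stated assumptions: the class names, field names, and the `sentence_at` helper are hypothetical, the sample values are invented placeholders, and none of it reproduces the project's actual XML markup or software.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Word:
    """Word level: a transcribed form with its interlinear gloss."""
    form: str
    gloss: str
    start: Optional[float] = None  # seconds; word-level alignment is optional
    end: Optional[float] = None


@dataclass
class Sentence:
    """Sentence level: time-aligned span with a free translation."""
    start: float  # seconds from the beginning of the recording
    end: float
    translation: str
    words: List[Word] = field(default_factory=list)


@dataclass
class Text:
    """Text level: metadata plus the ordered sentences."""
    metadata: Dict[str, str]
    sentences: List[Sentence] = field(default_factory=list)


def sentence_at(text: Text, t: float) -> Optional[Sentence]:
    """Return the sentence whose half-open span [start, end) contains t."""
    for sentence in text.sentences:
        if sentence.start <= t < sentence.end:
            return sentence
    return None


# Invented placeholder data, for illustration only.
demo = Text(
    metadata={"title": "demo text"},
    sentences=[
        Sentence(0.0, 2.5, "first sentence", [Word("w1", "g1"), Word("w2", "g2")]),
        Sentence(2.5, 4.0, "second sentence", [Word("w3", "g3", 2.5, 3.1)]),
    ],
)

hit = sentence_at(demo, 3.0)
print(hit.translation if hit else "no sentence at this time")
```

A lookup of this kind is what lets a player position the text display from a playback time, or the reverse; a linear scan is enough for a sketch, while long recordings would favor a binary search over the sorted start times.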

DOI: 10.1016/S0167-6393(00)00070-4


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Linguistic documents synchronizing sound and text</title>
<author>
<name sortKey="Jacobson, Michel" sort="Jacobson, Michel" uniqKey="Jacobson M" first="Michel" last="Jacobson">Michel Jacobson</name>
</author>
<author>
<name sortKey="Michailovsky, Boyd" sort="Michailovsky, Boyd" uniqKey="Michailovsky B" first="Boyd" last="Michailovsky">Boyd Michailovsky</name>
</author>
<author>
<name sortKey="B Lowe, John" sort="B Lowe, John" uniqKey="B Lowe J" first="John" last="B. Lowe">John B. Lowe</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:76F37F4EC8D5D4F4473AAD428436716F4418582A</idno>
<date when="2001" year="2001">2001</date>
<idno type="doi">10.1016/S0167-6393(00)00070-4</idno>
<idno type="url">https://api.istex.fr/document/76F37F4EC8D5D4F4473AAD428436716F4418582A/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000420</idno>
<idno type="wicri:Area/Istex/Curation">000420</idno>
<idno type="wicri:Area/Istex/Checkpoint">000243</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000243</idno>
<idno type="wicri:doubleKey">0167-6393:2001:Jacobson M:linguistic:documents:synchronizing</idno>
<idno type="wicri:Area/Main/Merge">000317</idno>
<idno type="wicri:Area/Main/Curation">000293</idno>
<idno type="wicri:Area/Main/Exploration">000293</idno>
<idno type="wicri:Area/France/Extraction">000035</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">Linguistic documents synchronizing sound and text</title>
<author>
<name sortKey="Jacobson, Michel" sort="Jacobson, Michel" uniqKey="Jacobson M" first="Michel" last="Jacobson">Michel Jacobson</name>
<affiliation wicri:level="3">
<country xml:lang="fr">France</country>
<wicri:regionArea>CNRS/LACITO, 7 rue Guy Moquet, Bat. 23, 94800 Villejuif</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Villejuif</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">France</country>
</affiliation>
</author>
<author>
<name sortKey="Michailovsky, Boyd" sort="Michailovsky, Boyd" uniqKey="Michailovsky B" first="Boyd" last="Michailovsky">Boyd Michailovsky</name>
<affiliation wicri:level="3">
<country xml:lang="fr">France</country>
<wicri:regionArea>CNRS/LACITO, 7 rue Guy Moquet, Bat. 23, 94800 Villejuif</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Villejuif</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="B Lowe, John" sort="B Lowe, John" uniqKey="B Lowe J" first="John" last="B. Lowe">John B. Lowe</name>
<affiliation wicri:level="3">
<country xml:lang="fr">France</country>
<wicri:regionArea>CNRS/LACITO, 7 rue Guy Moquet, Bat. 23, 94800 Villejuif</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Villejuif</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Speech Communication</title>
<title level="j" type="abbrev">SPECOM</title>
<idno type="ISSN">0167-6393</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="2000">2000</date>
<biblScope unit="volume">33</biblScope>
<biblScope unit="issue">1–2</biblScope>
<biblScope unit="page" from="79">79</biblScope>
<biblScope unit="page" to="96">96</biblScope>
</imprint>
<idno type="ISSN">0167-6393</idno>
</series>
<idno type="istex">76F37F4EC8D5D4F4473AAD428436716F4418582A</idno>
<idno type="DOI">10.1016/S0167-6393(00)00070-4</idno>
<idno type="PII">S0167-6393(00)00070-4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0167-6393</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">The goal of the Langues et Civilisations à Tradition Orale (LACITO) Linguistic Archive project is to conserve and disseminate recorded and transcribed oral literature and other linguistic materials, mainly in unwritten languages, giving simultaneous access to sound recordings and text annotation. The project uses XML markup for the kinds of annotation traditionally used in field linguistics. Transcriptions are segmented into sentences (roughly) and words. Annotations are associated with different levels: metadata at the text level, free translation at the sentence level, interlinear glosses at the word level, etc. Time-alignment is at the sentence and optionally at the word level. The project makes maximum use of standard, generic software tools. Marked-up data are processed using freely available XML software and displayed using standard browsers. The project has developed (1) an authoring tool, SoundIndex, to facilitate time-alignment, (2) a Java applet, which enables browsers to access time-aligned speech, (3) XSL stylesheets, which specify “views” on the data, and (4) Common Gateway Interface (CGI) scripts, which allow the user to choose documents and views and to enter queries. Current objectives include development of the annotation and software to facilitate linguistic research beyond simple browsing. Over 100 texts in 20 languages have been processed at the time of writing; some of these are available on the Internet for browsing and simple querying.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Île-de-France</li>
</region>
<settlement>
<li>Villejuif</li>
</settlement>
</list>
<tree>
<country name="France">
<region name="Île-de-France">
<name sortKey="Jacobson, Michel" sort="Jacobson, Michel" uniqKey="Jacobson M" first="Michel" last="Jacobson">Michel Jacobson</name>
</region>
<name sortKey="B Lowe, John" sort="B Lowe, John" uniqKey="B Lowe J" first="John" last="B. Lowe">John B. Lowe</name>
<name sortKey="Jacobson, Michel" sort="Jacobson, Michel" uniqKey="Jacobson M" first="Michel" last="Jacobson">Michel Jacobson</name>
<name sortKey="Michailovsky, Boyd" sort="Michailovsky, Boyd" uniqKey="Michailovsky B" first="Boyd" last="Michailovsky">Boyd Michailovsky</name>
</country>
</tree>
</affiliations>
</record>
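
The record above uses the `wicri:` prefix (for example on `wicri:istexFullTextTei` and `wicri:regionArea`) without declaring it, so a namespace-aware XML parser rejects the text as it stands. The sketch below shows one way to read it with Python's standard library anyway, by binding the prefix on the root element before parsing. The namespace URI `urn:example:wicri` and the helper names `parse_record` and `summarize` are assumptions made for illustration; they are not part of Wicri or Dilib, and the sample is an abridged copy of the record.

```python
import xml.etree.ElementTree as ET

# Hypothetical URI: any URI works, the point is only to bind the prefix.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text: str) -> ET.Element:
    """Parse a record after declaring the wicri prefix on the root tag."""
    patched = xml_text.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1
    )
    return ET.fromstring(patched)


def summarize(record: ET.Element) -> dict:
    """Collect the title, author names, and DOI from the TEI header."""
    return {
        "title": record.findtext(".//titleStmt/title"),
        "authors": [n.text for n in record.findall(".//titleStmt/author/name")],
        "doi": record.findtext(".//publicationStmt/idno[@type='doi']"),
    }


# Abridged copy of the record shown above.
SAMPLE = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Linguistic documents synchronizing sound and text</title>
<author>
<name sortKey="Jacobson, Michel">Michel Jacobson</name>
</author>
<author>
<name sortKey="Michailovsky, Boyd">Boyd Michailovsky</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="doi">10.1016/S0167-6393(00)00070-4</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

info = summarize(parse_record(SAMPLE))
print(info)
```

Patching the root tag keeps the rest of the record byte-for-byte unchanged, which is why it is used here instead of stripping the prefixed attributes and elements.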

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/France/Analysis
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000035 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/France/Analysis/biblio.hfd -nk 000035 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Wicri/Ticri
   |area=    TeiVM2
   |flux=    France
   |étape=   Analysis
   |type=    RBID
   |clé=     ISTEX:76F37F4EC8D5D4F4473AAD428436716F4418582A
   |texte=   Linguistic documents synchronizing sound and text
}}

Wicri

This area was generated with Dilib version V0.6.31.
Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024